Can ground truth label propagation from video help semantic segmentation?
For state-of-the-art semantic segmentation, training convolutional
neural networks (CNNs) requires dense pixelwise ground truth (GT) labeling,
which is expensive and involves extensive human effort. In this work, we study
the possibility of using auxiliary ground truth, so-called pseudo
ground truth (PGT), to improve the performance. The PGT is obtained by
propagating the labels of a GT frame to its subsequent frames in the video
using a simple CRF-based, cue integration framework. Our main contribution is
to demonstrate the use of noisy PGT along with GT to improve the performance of
a CNN. We perform a systematic analysis to find the right kind of PGT that
needs to be added along with the GT for training a CNN. In this regard, we
explore three aspects of PGT which influence the learning of a CNN: i) the PGT
labeling has to be of good quality; ii) the PGT images have to be different
compared to the GT images; iii) the PGT has to be trusted differently than GT.
We conclude that PGT which is diverse from GT images and has good quality of
labeling can indeed help improve the performance of a CNN. Also, when PGT is
multiple folds larger than GT, weighing down the trust on PGT helps in
improving the accuracy. Finally, we show that using PGT along with GT, the
performance of a Fully Convolutional Network (FCN) on CamVid data is increased by
on IoU accuracy. We believe such an approach can be used to train CNNs
for semantic video segmentation where sequentially labeled image frames are
needed. To this end, we provide recommendations for using PGT strategically for
semantic segmentation and hence bypass the need for extensive human efforts in
labeling.
Comment: To appear at ECCV 2016 Workshop on Video Segmentation
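The abstract above says that PGT "has to be trusted differently than GT" but does not state how the trust is weighted. As a minimal sketch, assuming a single scalar down-weight applied to every PGT sample in a cross-entropy loss (the function name `weighted_cross_entropy` and the parameter `pgt_weight` are hypothetical and are not taken from the paper):

```python
import math

def weighted_cross_entropy(probs, labels, is_pgt, pgt_weight=0.5):
    """Weighted mean cross-entropy over samples.

    probs:      per-sample lists of class probabilities
    labels:     per-sample integer class labels
    is_pgt:     per-sample flags, True if the label is pseudo ground truth
    pgt_weight: assumed scalar trust for PGT samples (GT samples weigh 1.0)
    """
    total, weight_sum = 0.0, 0.0
    for p, y, pseudo in zip(probs, labels, is_pgt):
        w = pgt_weight if pseudo else 1.0
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum

# Usage: one GT sample and one PGT sample, with the PGT sample trusted half as much.
loss = weighted_cross_entropy(
    probs=[[0.5, 0.5], [0.25, 0.75]],
    labels=[0, 1],
    is_pgt=[False, True],
    pgt_weight=0.5,
)
print(loss)
```

Setting `pgt_weight=1.0` recovers the ordinary mean cross-entropy, and `pgt_weight=0.0` ignores the PGT samples entirely, so the parameter interpolates between the two training regimes.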
Deep Depth From Focus
Depth from focus (DFF) is one of the classical ill-posed inverse problems in
computer vision. Most approaches recover the depth at each pixel based on the
focal setting which exhibits maximal sharpness. Yet, it is not obvious how to
reliably estimate the sharpness level, particularly in low-textured areas. In
this paper, we propose 'Deep Depth From Focus (DDFF)' as the first end-to-end
learning approach to this problem. One of the main challenges we face is the
hunger for data of deep neural networks. In order to obtain a significant
amount of focal stacks with corresponding groundtruth depth, we propose to
leverage a light-field camera with a co-calibrated RGB-D sensor. This allows us
to digitally create focal stacks of varying sizes. Compared to existing
benchmarks our dataset is 25 times larger, enabling the use of machine learning
for this inverse problem. We compare our results with state-of-the-art DFF
methods and we also analyze the effect of several key deep architectural
components. These experiments show that our proposed method 'DDFFNet' achieves
state-of-the-art performance in all scenes, reducing depth error by more than
75% compared to the classical DFF methods.
Comment: accepted to Asian Conference on Computer Vision (ACCV) 201
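The classical baseline described in the abstract above, recovering depth at each pixel from the focal setting with maximal sharpness, can be sketched as follows. This is an illustration of the classical idea only, not of DDFFNet; the choice of an absolute 4-neighbour Laplacian as the sharpness measure and the function names are assumptions.

```python
def laplacian_sharpness(img):
    """Absolute 4-neighbour Laplacian per pixel; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = abs(
                img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
                - 4 * img[y][x]
            )
    return out

def depth_from_focus(stack):
    """For each pixel, return the index of the focal slice with maximal sharpness.

    stack: list of equally sized grayscale images (lists of rows), one per
           focal setting. Ties resolve to the lowest slice index.
    """
    sharp = [laplacian_sharpness(img) for img in stack]
    h, w = len(stack[0]), len(stack[0][0])
    return [
        [max(range(len(stack)), key=lambda i: sharp[i][y][x]) for x in range(w)]
        for y in range(h)
    ]

# Usage: slice 0 is featureless, slice 1 has a sharp centre pixel.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
spike = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
index_map = depth_from_focus([flat, spike])
print(index_map)
```

In regions where every slice has the same sharpness the argmax is arbitrary (here it falls back to slice 0), which mirrors the abstract's point that sharpness is hard to estimate reliably in low-textured areas.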
The consequence of excess configurational entropy on fragility: the case of a polymer/oligomer blend
By taking advantage of the molecular weight dependence of the glass
transition of polymers and their ability to form perfectly miscible blends, we
propose a way to modify the fragility of a system, from fragile to strong,
keeping the same glass properties, i.e. vibrational density of states,
mean-square displacement and local structure. Both slow and fast dynamics are
investigated by calorimetry and neutron scattering in an athermal
polystyrene/oligomer blend, and compared to those of a pure 17-mer polystyrene
taken as a reference, with the same Tg. Whereas the blend and the pure 17-mer
have the same heat capacity in the glass and in the liquid, their fragilities
differ strongly. This difference in fragility is related to an extra
configurational entropy created by the mixing process and acting at a scale
much larger than the interchain distance, without affecting the fast dynamics
and the structure of the glass.
Joint Learning of Intrinsic Images and Semantic Segmentation
Semantic segmentation of outdoor scenes is problematic when there are
variations in imaging conditions. It is known that albedo (reflectance) is
invariant to all kinds of illumination effects. Thus, using reflectance images
for the semantic segmentation task can be favorable. Moreover, not only may
segmentation benefit from reflectance, but segmentation may also be useful
for reflectance computation. Therefore, in this paper, the tasks of semantic
segmentation and intrinsic image decomposition are considered as a combined
process by exploring their mutual relationship in a joint fashion. To that end,
we propose a supervised end-to-end CNN architecture to jointly learn intrinsic
image decomposition and semantic segmentation. We analyze the gains of
addressing those two problems jointly. Moreover, new cascade CNN architectures
for intrinsic-for-segmentation and segmentation-for-intrinsic are proposed as
single tasks. Furthermore, a dataset of 35K synthetic images of natural
environments is created with corresponding albedo and shading (intrinsics), as
well as semantic labels (segmentation) assigned to each object/scene. The
experiments show that joint learning of intrinsic image decomposition and
semantic segmentation is beneficial for both tasks for natural scenes. Dataset
and models are available at: https://ivi.fnwi.uva.nl/cv/intrinseg
Comment: ECCV 201
Classical and Quantum Chaos in a quantum dot in time-periodic magnetic fields
We investigate the classical and quantum dynamics of an electron confined to
a circular quantum dot in the presence of homogeneous magnetic
fields. The classical motion shows a transition to chaotic behavior depending
on the ratio of field magnitudes and the cyclotron
frequency in units of the drive frequency. We determine a
phase boundary between regular and chaotic classical behavior in the
vs plane. In the quantum regime we evaluate the quasi-energy
spectrum of the time-evolution operator. We show that the nearest neighbor
quasi-energy eigenvalues show a transition from level clustering to level
repulsion as one moves from the regular to chaotic regime in the
plane. The statistic confirms this
transition. In the chaotic regime, the eigenfunction statistics coincides with
the Porter-Thomas prediction. Finally, we explicitly establish the phase space
correspondence between the classical and quantum solutions via the Husimi phase
space distributions of the model. Possible experimentally feasible conditions
to see these effects are discussed.
Comment: 26 pages and 17 PostScript figures, two large ones can be obtained
from the Author
Unsupervised Monocular Depth Estimation for Night-time Images using Adversarial Domain Feature Adaptation
In this paper, we look into the problem of estimating per-pixel depth maps
from unconstrained RGB monocular night-time images which is a difficult task
that has not been addressed adequately in the literature. The state-of-the-art
day-time depth estimation methods fail miserably when tested with night-time
images due to a large domain shift between them. The usual photometric losses
used for training these networks may not work for night-time images due to the
absence of uniform lighting which is commonly present in day-time images,
making it a difficult problem to solve. We propose to solve this problem by
posing it as a domain adaptation problem where a network trained with day-time
images is adapted to work for night-time images. Specifically, an encoder is
trained to generate features from night-time images that are indistinguishable
from those obtained from day-time images by using a PatchGAN-based adversarial
discriminative learning method. Unlike the existing methods that directly adapt
depth prediction (network output), we propose to adapt feature maps obtained
from the encoder network so that a pre-trained day-time depth decoder can be
directly used for predicting depth from these adapted features. Hence, the
resulting method is termed as "Adversarial Domain Feature Adaptation (ADFA)"
and its efficacy is demonstrated through experimentation on the challenging
Oxford night driving dataset. Also, the modular encoder-decoder architecture
for the proposed ADFA method allows us to use the encoder module as a feature
extractor which can be used in many other applications. One such application is
demonstrated where the features obtained from our adapted encoder network are
shown to outperform other state-of-the-art methods in a visual place
recognition problem, thereby, further establishing the usefulness and
effectiveness of the proposed approach.
Comment: ECCV 202
Using Fluorescence Recovery After Photobleaching (FRAP) to study dynamics of the Structural Maintenance of Chromosome (SMC) complex in vivo
The SMC complex, MukBEF, is important for chromosome organization and
segregation in Escherichia coli. Fluorescently tagged MukBEF forms distinct
spots (or 'foci') in the cell, where it is thought to carry out most of its
chromosome associated activities. This chapter outlines the technique of
Fluorescence Recovery After Photobleaching (FRAP) as a method to study the
properties of YFP-tagged MukB in fluorescent foci. This method can provide
important insight into the dynamics of MukB on DNA and be used to study its
biochemical properties in vivo.
Inner Space Preserving Generative Pose Machine
Image-based generative methods, such as generative adversarial networks
(GANs), have already been able to generate realistic images with much context
control, especially when they are conditioned. However, most successful
frameworks share a common procedure which performs an image-to-image
translation with pose of figures in the image untouched. When the objective is
reposing a figure in an image while preserving the rest of the image, the
state-of-the-art mainly assumes a single rigid body with simple background and
limited pose shift, which can hardly be extended to the images under normal
settings. In this paper, we introduce an image "inner space" preserving model
that assigns an interpretable low-dimensional pose descriptor (LDPD) to an
articulated figure in the image. Figure reposing is then generated by passing
the LDPD and the original image through multi-stage augmented hourglass
networks in a conditional GAN structure, called inner space preserving
generative pose machine (ISP-GPM). We evaluated ISP-GPM on reposing human
figures, which are highly articulated with versatile variations. Testing a
state-of-the-art pose estimator on our reposed dataset gave an accuracy of over
80% on the PCK0.5 metric. The results also showed that our ISP-GPM is able to
preserve the background with high accuracy while reasonably recovering the area
blocked by the figure to be reposed.
Comment: http://www.northeastern.edu/ostadabbas/2018/07/23/inner-space-preserving-generative-pose-machine
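The PCK0.5 figure quoted in the abstract above is a percentage-of-correct-keypoints score. A minimal sketch of how such a score is commonly computed is given below, assuming a keypoint counts as correct when its distance to the ground truth is at most `alpha` times a reference length. The abstract does not say which reference length is used (head size and torso size are common choices), so `ref_length` is left as a caller-supplied assumption and the function name `pck` is hypothetical.

```python
import math

def pck(pred, truth, ref_length, alpha=0.5):
    """Fraction of keypoints whose error is at most alpha * ref_length.

    pred, truth: equally long lists of (x, y) keypoint coordinates
    ref_length:  assumed normalisation length for the figure
    alpha:       tolerance as a fraction of ref_length (0.5 for PCK0.5)
    """
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(pred, truth)
        if math.hypot(px - tx, py - ty) <= alpha * ref_length
    )
    return hits / len(truth)

# Usage: three keypoints scored against a reference length of 4 pixels.
score = pck(
    pred=[(0, 0), (3, 4), (10, 10)],
    truth=[(0, 0), (0, 0), (10, 11)],
    ref_length=4,
)
print(score)
```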
Simpler Statistically Sender Private Oblivious Transfer from Ideals of Cyclotomic Integers
We present a two-message oblivious transfer protocol achieving statistical sender privacy and computational receiver privacy based on the RLWE assumption for cyclotomic number fields. This work improves upon prior lattice-based statistically sender-private oblivious transfer protocols by reducing the total communication between parties by a factor for transfer of length messages.
Prior work of Brakerski and Döttling uses transference theorems to show that either a lattice or its dual must have short vectors, the existence of which guarantees lossy encryption for encodings with respect to that lattice, and therefore statistical sender privacy. In the case of ideal lattices from embeddings of cyclotomic integers, the existence of one short vector implies the existence of many, and therefore encryption with respect to either a lattice or its dual is guaranteed to "lose" more information about the message than can be ensured in the case of general lattices. This additional structure of ideals of cyclotomic integers allows for efficiency improvements beyond those that are typical when moving from the generic to ideal lattice setting, resulting in smaller message sizes for sender and receiver, as well as a protocol that is simpler to describe and analyze.